HappyDB: A Corpus of 100,000 Crowdsourced Happy Moments

Authors

  • Akari Asai
  • Sara Evensen
  • Behzad Golshan
  • Alon Y. Halevy
  • Vivian Li
  • Andrei Lopatenko
  • Daniela Stepanov
  • Yoshihiko Suhara
  • Wang Chiew Tan
  • Yinzhan Xu
Abstract

The science of happiness is an area of positive psychology concerned with understanding what behaviors make people happy in a sustainable fashion. Recently, there has been interest in developing technologies that help incorporate the findings of the science of happiness into users’ daily lives by steering them towards behaviors that increase happiness. With the goal of building technology that can understand how people express their happy moments in text, we crowd-sourced HappyDB, a corpus of 100,000 happy moments that we make publicly available. This paper describes HappyDB and its properties, and outlines several important NLP problems that can be studied with the help of the corpus. We also apply several state-of-the-art analysis techniques to analyze HappyDB. Our results demonstrate the need for deeper NLP techniques to be developed, which makes HappyDB an exciting resource for follow-on research.

Similar resources

Overview of the 2nd International Competition on Wikipedia Vandalism Detection

The paper overviews the vandalism detection task of the PAN’11 competition. A new corpus is introduced which comprises about 30 000 Wikipedia edits in the languages English, German and Spanish as well as the necessary crowdsourced annotations. Moreover, the performance of three vandalism detectors is evaluated and compared to those of the PAN’10 competition. Vivien Petras and Paul Clough (Eds.)...

DeScript: A Crowdsourced Corpus for the Acquisition of High-Quality Script Knowledge

Scripts are standardized event sequences describing typical everyday activities, which play an important role in the computational modeling of cognitive abilities (in particular for natural language processing). We present a large-scale crowdsourced collection of explicit linguistic descriptions of script-specific event sequences (40 scenarios with 100 sequences each). The corpus is enriched wi...

Learning From Stories: Using Crowdsourced Narratives to Train Virtual Agents

In this work we introduce Quixote, a system that makes programming virtual agents more accessible to nonprogrammers by enabling these agents to be trained using the sociocultural knowledge present in stories. Quixote uses a corpus of exemplar stories to automatically engineer a reward function that is used to train virtual agents to exhibit desired behaviors using reinforcement learning. We sho...

A Corpus-based Approach to Finding Happiness

What are the sources of happiness and sadness in everyday life? In this paper, we employ ‘linguistic ethnography’ to seek out where happiness lies in our everyday lives by considering a corpus of blogposts from the LiveJournal community annotated with happy and sad moods. By analyzing this corpus, we derive lists of happy and sad words and phrases annotated by their ‘happiness factor.’ Various ...

Learning Sociocultural Knowledge via Crowdsourced Examples

Computational systems can use sociocultural knowledge to understand human behavior and interact with humans in more natural ways. However, such systems are limited by their reliance on hand-authored sociocultural knowledge and models. We introduce an approach to automatically learn robust, script-like sociocultural knowledge from crowdsourced narratives. Crowdsourcing, the use of anonymous huma...

Journal title:
  • CoRR

Volume abs/1801.07746

Publication date 2018